A Logistic Based Mathematical Model to Optimize Duplicate Elimination Ratio in Content Defined Chunking Based Big Data Storage System
Authors
Longxiang Wang 1, Xiaoshe Dong 1, Xingjun Zhang 1,*, Fuliang Guo 1, Yinfeng Wang 2 and Weifeng Gong 3
1 The School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China; [email protected] (L.W.); [email protected] (X.D.); [email protected] (F.G.)
2 The Shenzhen Institute of Information Technology, Shenzhen 518172, China; [email protected]
3 State Key Laboratory of High-End Server & Storage Technology, Jinan 250101, China; [email protected]
* Correspondence: [email protected]; Tel.: +86-029-8266-8478
Similar Resources
A Dynamic Deduplication Approach for Big Data Storage
As data grows every day, managing storage devices for this explosive growth of digital data is a very challenging task. Data reduction has become a crucial problem. Deduplication plays a vital role in removing redundancy in large-scale cluster computing storage. As a result, deduplication provides better storage utilization by eliminating redundant copies of data and savi...
Bimodal Content Defined Chunking for Backup Streams
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are well established methods of separating a data stream into variable-size chunks such that duplicate content has a good chance of being discovered irrespective of its position in the data stream. Requirements for CDC incl...
Survey of Research on Chunking Techniques
The explosive growth of data produced by different devices and applications has contributed to the abundance of big data. To process such amounts of data efficiently, strategies such as de-duplication have been employed. Among the three levels of de-duplication, namely file level, block level and chunk level, de-duplication at chunk level, also known as byte level, is the most popular a...
Two Stage Max Gain Content Defined Chunking for De-duplication
Data de-duplication is a very simple concept with very smart technology behind it. Data blocks are stored only once: de-duplication systems decrease storage consumption by identifying distinct chunks of data with identical content. They then store a single copy of the chunk along with metadata about how to reconstruct the original files from the chunks; this takes up less stora...
Offline Selective Data Deduplication for Primary Storage Systems
Data deduplication is a technology that eliminates redundant data to save storage space. Most previous studies on data deduplication target backup storage, where the deduplication ratio and throughput are important. However, data deduplication on primary storage has recently been receiving attention; in this case, I/O latency should be considered equally with the deduplication ratio. Unfortunat...
Journal title:
- Symmetry
Volume 8, Issue
Pages -
Publication date: 2016